-
ShuffleNet is a state-of-the-art lightweight convolutional neural network architecture. Its basic operations include group convolution, channel-wise convolution, and channel shuffling. However, channel shuffling is manually designed on empirical grounds. Mathematically, shuffling is multiplication by a permutation matrix. In this paper, we propose to automate channel shuffling by learning permutation matrices during network training. We introduce an exact, Lipschitz-continuous, non-convex penalty that can be incorporated into stochastic gradient descent to approximate permutations to high precision. Exact permutations are obtained by simple rounding at the end of training and are used in inference. The resulting network, referred to as AutoShuffleNet, achieves improved classification accuracies on CIFAR-10, CIFAR-100 and ImageNet while preserving the inference cost of ShuffleNet. In addition, we find experimentally that the standard convex relaxation of permutation matrices into doubly stochastic matrices leads to poor performance. We prove that permutation matrices are recovered exactly when our penalty is zero, with error bounds when the penalty is small. We present examples of permutation optimization through graph matching and two-layer neural network models in which the loss functions can be computed in closed analytical form; in these examples, the convex relaxation fails to capture permutations whereas our penalty succeeds.
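As an illustration only (the paper's exact penalty is not reproduced here), a minimal NumPy sketch of a Lipschitz-continuous l1 - l2 penalty that vanishes exactly on permutation matrices when P is constrained to be doubly stochastic, together with the end-of-training rounding:

    import numpy as np

    def perm_penalty(P):
        # l1 - l2 difference, summed over rows and columns of a nonnegative
        # doubly stochastic P: each term is zero exactly when that row
        # (column) has at most one nonzero entry, so the total penalty
        # vanishes only on permutation matrices.
        row_term = np.abs(P).sum(axis=1) - np.linalg.norm(P, axis=1)
        col_term = np.abs(P).sum(axis=0) - np.linalg.norm(P, axis=0)
        return row_term.sum() + col_term.sum()

    def round_to_permutation(P):
        # simple rounding at the end of training: place a 1 at each row's
        # argmax (assumes the relaxed P is already near a permutation)
        Q = np.zeros_like(P)
        Q[np.arange(P.shape[0]), np.argmax(P, axis=1)] = 1.0
        return Q

In training, a term such as lam * perm_penalty(P) would be added to the classification loss, with P kept (approximately) doubly stochastic by projection or normalization.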
-
We study channel number reduction in combination with weight binarization (1-bit weight precision) to trim a convolutional neural network for a keyword spotting (classification) task. We adopt a group-wise splitting method based on the group Lasso penalty to achieve over 50% channel sparsity while keeping the accuracy loss within 0.25%. We present an effective three-stage procedure for balancing accuracy and sparsity during network training.
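A hedged sketch of the two ingredients, assuming one group per output channel and a mean-absolute-value binarization scale (both are illustrative choices, not necessarily the paper's exact scheme):

    import numpy as np

    def group_lasso_penalty(W):
        # W: convolution weights of shape (out_ch, in_ch, kh, kw).
        # Summing the Euclidean norms of per-output-channel groups drives
        # entire channels to zero, making them removable after training.
        flat = W.reshape(W.shape[0], -1)
        return np.linalg.norm(flat, axis=1).sum()

    def binarize(W):
        # 1-bit weight precision: sign of each weight, scaled by the mean
        # absolute value so the binarized weights match W in magnitude.
        return np.sign(W) * np.abs(W).mean()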
-
It is expensive to compute residual diffusivity in chaotic incompressible flows by solving the advection-diffusion equation, due to the formation of sharp internal layers in the advection-dominated regime. Proper orthogonal decomposition (POD) is a classical method that constructs a small number of adaptive orthogonal basis vectors for low-cost computation from snapshots of fully resolved solutions at a particular molecular diffusivity D0*. The quality of the POD basis deteriorates when it is applied at D0 << D0*. To improve POD, we adapt a super-resolution generative adversarial network (SRGAN) to train a nonlinear mapping based on snapshot data at two values of the molecular diffusivity. The mapping models the sharpening effect on internal layers as D0 becomes smaller. We show through numerical experiments that after applying such a mapping to snapshots, the prediction accuracy of residual diffusivity improves considerably over that of standard POD.
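The SRGAN mapping is a trained network and is not reproduced here; below is a minimal sketch of the classical POD construction from snapshots via the singular value decomposition:

    import numpy as np

    def pod_basis(snapshots, r):
        # snapshots: array of shape (n, m) whose m columns are fully
        # resolved solutions. The r leading left singular vectors form
        # the POD basis, i.e. the rank-r subspace that minimizes the
        # snapshot projection error.
        U, _, _ = np.linalg.svd(snapshots, full_matrices=False)
        return U[:, :r]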
-
A relaxed group-wise splitting method (RGSM) is developed and evaluated for channel pruning of deep neural networks. Experiments with VGG-16 and ResNet-18 architectures on the CIFAR-10/100 image data show that RGSM achieves much higher channel sparsity than the group Lasso method while keeping comparable accuracy.
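A splitting method of this kind alternates ordinary gradient steps on a relaxed copy of the weights with a group-wise proximal (thresholding) step. As a hedged sketch, the proximal operator of the group Lasso penalty with one group per output channel (RGSM itself may relax a different group penalty):

    import numpy as np

    def group_soft_threshold(W, lam):
        # Shrink each output channel's Euclidean norm by lam and zero out
        # channels whose norm falls below lam (the group Lasso prox).
        flat = W.reshape(W.shape[0], -1)
        norms = np.linalg.norm(flat, axis=1, keepdims=True)
        scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
        return (flat * scale).reshape(W.shape)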
-
We propose a multistage differentiable method to select convolutional channels and construct light neural networks from a heavy network for inference on a subset of a big data set. The selection proceeds backward through the layers and utilizes a sparse penalty to diversify channel scores. The resulting light network achieves a sizable accuracy gain over the baseline heavy network.
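As a hedged illustration of differentiable channel scoring (the class name, per-channel gating form, and l1 penalty are assumptions, not the paper's exact construction), a minimal PyTorch sketch:

    import torch
    import torch.nn as nn

    class ChannelGate(nn.Module):
        # One learnable score per channel scales the feature maps. Adding
        # sparse_penalty() to the training loss sparsifies and spreads the
        # scores; channels with near-zero scores are then dropped when the
        # light network is extracted.
        def __init__(self, num_channels):
            super().__init__()
            self.scores = nn.Parameter(torch.ones(num_channels))

        def forward(self, x):  # x: (N, C, H, W)
            return x * self.scores.view(1, -1, 1, 1)

        def sparse_penalty(self):
            return self.scores.abs().sum()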